The MAILING DATE of this communication appears on the cover sheet with the correspondence address.
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) [X] Responsive to communication(s) filed on 24 January 2005.
2a) [X] This action is FINAL. 2b) [ ] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-23 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-14 and 17-20 is/are rejected.

7) [X] Claim(s) 15-16 and 21-23 is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 30 March 2004 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).

* See the attached detailed Office action for a list of the certified copies not received.
DETAILED ACTION
Response to Amendment 

1. In response to the office action from 10/22/2004, the applicant has submitted an
amendment, filed 1/24/2005, amending Claims 1, 8, 10 and 23, while arguing to traverse the art
rejection based on the limitation regarding the validation of word boundaries within a token
(Amendment, Page 9). The applicant's arguments have been fully considered but are moot with
respect to the new grounds of rejection in view of Carus et al (U.S. Patent: 6,185,524).

2. Based on the amendments to Claim 23, the examiner has withdrawn the previous 
objections directed towards minor informalities. 

Claim Rejections - 35 USC §103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

4. Claims 1-3, 5-7, and 10-11 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Bond et al (U.S. Patent: 6,539,348) in view of Carus et al (US Patent: 6,185,524). 

With respect to Claim 1, Bond discloses: 
Receiving the input string (Col. 3, Lines 9-10);

Segmenting the input string into one or more proposed tokens (Col 3, Lines 21-29); 

Validating the proposed tokens by submitting the proposed tokens to a linguistic 
knowledge component to determine whether the proposed tokens represent linguistically 
meaningful units (Col 3, Lines 29-48); and 

If not, re-segmenting the input string into one or more different proposed tokens (Col 3, 
Lines 45-61). 

Although Bond teaches a process of token validation, Bond does not specifically teach
that word boundaries within a token are validated; however, Carus teaches verifying word
boundaries (starting and ending characters of a word) to properly identify a most likely word
candidate (Col 3, Line 31 - Col 4, Line 51; Col 12, Lines 49-56).

Bond and Carus are analogous art because they are from a similar field of endeavor in 
text parsing and word identification. Thus, it would have been obvious to a person of ordinary 
skill in the art, at the time of invention, to modify the teachings of Bond with the means of 
identifying word boundaries as taught by Carus to improve token processing efficiency through 
the use of a smaller word segment lexicon to identify words from their boundaries (Carus, Col 
1, Lines 33-41; Col 12, Lines 49-56). 
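
For illustration only, the receive / segment / validate / re-segment sequence recited above for Claim 1 can be sketched as follows. This is not code from Bond, Carus, or the application; the lexicon contents and every function name are hypothetical, and a plain set lookup stands in for the "linguistic knowledge component".

```python
# Hypothetical lexicon standing in for the linguistic knowledge component.
LEXICON = {"the", "cat", "sat", "on", "mat", "."}

def propose_tokens(text):
    """First proposal: segment the input string at white space."""
    return text.split()

def is_valid(token):
    """Validate a proposed token by looking it up in the lexicon."""
    return token.lower() in LEXICON

def resegment(token):
    """Alternative proposal: split one trailing non-alphanumeric character off."""
    if len(token) > 1 and not token[-1].isalnum():
        return [token[:-1], token[-1]]
    return [token]

def tokenize(text):
    """Keep tokens that validate; re-segment only those that do not."""
    result = []
    for token in propose_tokens(text):
        if is_valid(token):
            result.append(token)
        else:
            result.extend(resegment(token))
    return result

print(tokenize("The cat sat on the mat."))
```

In this sketch "mat." fails the lexicon lookup and is re-segmented into "mat" and ".", while the other tokens are accepted as first proposed.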

With respect to Claim 2, Bond recites: 

Accessing segmentation criteria arranged in a predetermined hierarchy of segmentation 
criteria, and segmenting based on the segmentation criteria in an order based on the hierarchy 
(Col 10, Lines 11-24). 

With respect to Claim 3, Bond discloses: 
Accessing language-specific data containing a portion of the segmentation criteria (Col.
7, Lines 19-23).

With respect to Claim 5, Bond discloses: 

Validating and re-segmenting until all characters in the input string have been validated
or until the predetermined hierarchy of segmentation criteria has been exhausted (Col. 2,
Lines 3-11).
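
As a hedged illustration of the limitations recited for Claims 2 and 5, the sketch below applies segmentation criteria in a fixed order and stops when a piece validates or when the hierarchy is exhausted. The two criteria, their ordering, and the lexicon are assumptions chosen for the example, not taken from any cited reference.

```python
import re

# Hypothetical lexicon; "-" is listed so a split-off hyphen validates.
LEXICON = {"e-mail", "well", "known", "fact", "-"}

# Hypothetical hierarchy of segmentation criteria, tried in this order.
CRITERIA = [
    lambda s: s.split(),                              # 1. white space
    lambda s: [p for p in re.split(r"(-)", s) if p],  # 2. hyphens, kept as tokens
]

def segment(text, level=0):
    """Return (token, validated) pairs for the given text."""
    if level >= len(CRITERIA):
        return [(text, False)]  # hierarchy exhausted without validation
    out = []
    for piece in CRITERIA[level](text):
        if piece.lower() in LEXICON:
            out.append((piece, True))
        else:
            # Only pieces that fail validation move to the next criterion.
            out.extend(segment(piece, level + 1))
    return out

print(segment("well-known e-mail fact"))
```

Here "e-mail" validates at the white-space level and is never split, whereas "well-known" fails and is re-segmented at the hyphen level.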

With respect to Claim 6, Bond recites: 

Accessing the lexicon to determine whether it contains the proposed tokens (Col 3, Lines
29-48).

With respect to Claim 7, Bond discloses: 

Invoking the morphological analyzer to convert a form of the proposed tokens to a 
morphologically different form (Col 3, Lines 45-50); and 

Accessing the lexicon to determine whether it contains the morphologically different 
form of the token (Col 3, Lines 50-52). 
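
The two steps recited for Claim 7 (convert the token to a morphologically different form, then look that form up) can be pictured with the minimal sketch below. The suffix rules are a toy stand-in for a morphological analyzer and are purely hypothetical, as are the lexicon and function names.

```python
# Hypothetical lexicon of base forms.
LEXICON = {"walk", "token", "city"}

# Hypothetical suffix rules: (suffix to remove, replacement).
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]

def base_forms(word):
    """Yield candidate base forms produced by the toy suffix rules."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            yield word[: -len(suffix)] + replacement

def validate(token):
    """Return the lexicon entry that validates the token, or None."""
    word = token.lower()
    if word in LEXICON:
        return word
    for candidate in base_forms(word):
        if candidate in LEXICON:
            return candidate
    return None

print(validate("Walked"), validate("cities"), validate("xyz"))
```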

Claim 10 contains subject matter similar to Claims 1 and 7, and thus, is rejected for the 
same reasons. 

With respect to Claim 11, Bond recites: 
Repeating the steps of proposing a subsequent segmentation and submitting the subsequent 
segmentation to the linguistic knowledge component until the portion of the input string is 
validated or the portion of the input string has been segmented according to a predetermined 
number of segmentation criteria (Col 10, Lines 11-24).
5. Claims 4, 8, 9, 12, and 17-19 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Bond et al in view of Carus et al (U.S. Patent: 6,185,524), and further in view of Carus
(U.S. Patent: 5,890,103).

With respect to Claim 4, Bond in view of Carus et al (U.S. Patent: 6,185,524) teaches
the natural language processing system utilizing a dictionary lookup-process in determining a
proper sentence segmentation format, as applied to Claim 1. Bond in view of Carus (U.S.
Patent: 6,185,524) does not specifically suggest language-dependent punctuation data in a
segmentation process, however, Carus (U.S. Patent: 5,890,103) recites:

Accessing a precedence hierarchy of punctuation in the language-specific data, the 
precedence hierarchy being arranged based on binding properties of the punctuation in the 
precedence hierarchy, and segmenting the input string based on the punctuation in an order based 
on the precedence hierarchy (segmentation based on punctuation placement for specific 
languages, Col. 42, Lines 10-37). 

Bond and Carus are analogous art because they are from a similar field of endeavor in 
natural language processing. Thus, it would have been obvious to a person of ordinary skill in 
the art, at the time of invention, to modify the teachings of Bond in view of Carus et al (U.S. 
Patent: 6,185,524) with the use of language specific segmentation rules based on punctuation 
placement as taught by Carus (U.S. Patent: 5,890,103) to implement higher level linguistic 
processing (Carus, Col. 2, Lines 9-16) in order to prevent incorrect segmentation by identifying 
special characters (Carus, Col. 39, Lines 23-28) that could have different meanings for a specific 
language based upon character location. 
With respect to Claim 8, Bond in view of Carus et al teaches the natural language
processing system utilizing a dictionary lookup-process in determining a proper sentence
segmentation format, as applied to Claim 1, while Carus (U.S. Patent: 5,890,103) teaches the
use of language specific segmentation rules based on punctuation placement as applied to Claim
4.

Claim 9 contains subject matter similar to Claim 5, and thus, is rejected for the same 
reasons. 

With respect to Claim 12, Bond in view of Carus et al (U.S. Patent: 6,185,524) teaches 
the natural language processing system utilizing a dictionary lookup-process in determining a 
proper sentence segmentation format, as applied to Claim 10. Bond in view of Carus et al (U.S. 
Patent: 6,185,524) does not specifically suggest segmenting an input string at white spaces, 
however such a segmenting technique is well known and commonly used in the art as a more 
basic means of parsing input text as is evidenced by Carus (U.S. Patent: 5,890,103) (Col. 13, 
Lines 62-64). 

Bond and Carus are analogous art because they are from a similar field of endeavor in 
natural language processing. Thus, it would have been obvious to a person of ordinary skill in 
the art, at the time of invention, to modify the teachings of Bond in view of Carus et al (U.S. 
Patent: 6,185,524) with the means of segmenting an input text string at white spaces as taught 
by Carus (U.S. Patent: 5,890,103) to provide basic initial segmentation of an input text, based
on white spaces, for further linguistic analysis. 

With respect to Claim 17, Carus (U.S. Patent: 5,890,103) discloses the means of
segmenting an input text string at white spaces, as applied to Claim 12. 
With respect to Claim 18, Bond additionally discloses:

Determining whether the token contains either all alpha characters or all numeric
characters; and if so, indicating that the token cannot be further segmented and will be treated as
an unrecognized word (acronym containing all capital letters marked as an unknown word and
assigned a token, Col 3, Lines 45-61).
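
A minimal sketch of the check recited for Claim 18, assuming Python's built-in `str.isalpha` and `str.isdigit` as the tests for "all alpha" and "all numeric". The three labels returned by `classify` are hypothetical names introduced for the example.

```python
def is_unsegmentable(token):
    """True when the token consists only of letters or only of digits."""
    return token.isalpha() or token.isdigit()

def classify(token, lexicon):
    """Label a token as known, unrecognized, or a candidate for further segmentation."""
    if token.lower() in lexicon:
        return "known"
    if is_unsegmentable(token):
        # All-alpha or all-numeric: no further segmentation is attempted.
        return "unrecognized"
    return "needs-segmentation"

print(classify("NASA", {"the"}), classify("Win95", {"the"}))
```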

With respect to Claim 19, Carus (U.S. Patent: 5,890,103) further recites:
Determining whether the token includes final punctuation; and if so, segmenting the 
token into a subtoken by splitting off the final punctuation (Jones', Col. 41, Lines 1-10). 
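
The step recited for Claim 19 can be illustrated with the short sketch below, which splits a single trailing punctuation mark off a token. It assumes ASCII punctuation as defined by Python's `string.punctuation`; the function name is hypothetical and the code is not drawn from Carus.

```python
import string

def split_final_punctuation(token):
    """Split one trailing ASCII punctuation mark off the token, if present."""
    if len(token) > 1 and token[-1] in string.punctuation:
        return [token[:-1], token[-1]]
    return [token]

print(split_final_punctuation("Jones'"))
```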

6. Claims 13 and 14 are rejected under 35 U.S.C. 103(a) as being unpatentable over Bond
et al in view of Carus et al (U.S. Patent: 6,185,524), further in view of Carus (U.S. Patent:
5,890,103), and further in view of Grefenstette (U.S. Patent: 6,289,304).

With respect to Claim 13, Bond in view of Carus teaches the natural language processing 
system utilizing a dictionary lookup-process in determining a proper sentence segmentation 
format and a means of segmenting an input text string at white spaces, as applied to Claim 12. 
Bond in view of Carus does not teach the detection and segmentation of emoticons, however, 
Grefenstette discloses: 

Determining whether invalid tokens contain any of a predetermined plurality of multi-character
punctuation strings or emoticons and, if so, segmenting the tokens into subtokens based
on the multi-character punctuation strings or emoticons ("smileys," Col. 4, Line 62 - Col. 5, Line 4).
Bond, Carus, and Grefenstette are analogous art because they are from a similar field of
endeavor in natural language processing. Thus, it would have been obvious to a person of
ordinary skill in the art, at the time of invention, to modify the teachings of Bond in view of
Carus with the means of parsing emoticons in a text string as taught by Grefenstette in order to
increase natural language processing system capabilities by implementing a means for 
recognizing and segmenting emoticons which would otherwise have no meaning in a traditional 
lexicon. 
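
By way of illustration, segmenting a token at any of a predetermined set of emoticons or multi-character punctuation strings (the Claim 13 limitation) might look like the sketch below. The list of strings is hypothetical, and the regular-expression approach is an assumption of this example rather than anything described by Grefenstette.

```python
import re

# Hypothetical predetermined list of emoticons and multi-character punctuation.
EMOTICONS = [":-)", ":)", ";-)", ":-(", "..."]

# Longer strings are listed first so they are preferred over shorter ones.
_PATTERN = re.compile(
    "(" + "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True)) + ")"
)

def split_on_emoticons(token):
    """Split a token into subtokens, keeping each matched string as its own subtoken."""
    return [part for part in _PATTERN.split(token) if part]

print(split_on_emoticons("hello:-)"))
```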

With respect to Claim 14, Carus (U.S. Patent: 5,890,103) additionally recites: 
Determining whether invalid tokens contain punctuation marks; and if so, segmenting the 
tokens into subtokens according to a predetermined precedence hierarchy of punctuation 
(detecting an apostrophe within a text string and segmenting text based on apostrophe location, 
Col 40, Lines 1-50, and Col 42, Lines 10-37). 
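
One way to picture segmentation "according to a predetermined precedence hierarchy of punctuation" is sketched below: the token is split only at the highest-precedence mark it contains, and the resulting pieces could then be validated again. The ordering of marks is an assumption made for the example and does not come from Carus.

```python
# Hypothetical precedence order: marks earlier in the list are split on first.
PRECEDENCE = ["/", "-", "'"]

def split_by_precedence(token):
    """Split the token at the highest-precedence punctuation mark it contains."""
    for mark in PRECEDENCE:
        if mark in token and token != mark:
            pieces = []
            for i, part in enumerate(token.split(mark)):
                if i > 0:
                    pieces.append(mark)  # keep the mark as its own subtoken
                if part:
                    pieces.append(part)
            return pieces
    return [token]

print(split_by_precedence("and/or-not"))
```

In this sketch "and/or-not" is split at the slash only, leaving "or-not" intact for a later pass.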

7. Claim 20 is rejected under 35 U.S.C. 103(a) as being unpatentable over Bond et al in 
view of Carus et al (U.S. Patent: 6,185,524), further in view of Carus (U.S. Patent: 5,890,103), 
and further in view of Malsheen et al (U.S. Patent: 5,634,084). 

With respect to Claim 20, Bond in view of Carus does not teach determining whether a 
token contains both alpha and numeric characters and segmenting a string containing such 
characters at alpha-numeric boundaries, however Malsheen suggests: 

Determining whether invalid tokens contain both alpha and numeric characters; and if so, 
segmenting the tokens into subtokens at boundaries between the alpha and numeric characters in 
the tokens (parsing syllables that can consist of a combination of letters and numbers, Abstract). 
Bond, Carus, and Malsheen are analogous art because they are from a similar field of
endeavor in linguistic processing. Thus, it would have been obvious to a person of ordinary skill 
in the art, at the time of invention, to modify the teachings of Bond in view of Carus with the 
ability to determine whether a token contains both alpha and numeric characters and segment a 
string containing such characters as taught by Malsheen in order to improve natural language 
processing system capabilities by parsing alpha-numeric words which would not be separated 
using conventional text processing (Malsheen, Col 2, Lines 53-59). 
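
The Claim 20 limitation (detect a token that mixes alpha and numeric characters, then split at the boundaries between them) can be sketched as follows. The sketch treats only ASCII letters and digits as runs when splitting, which is a simplifying assumption; the function names are hypothetical and nothing here is taken from Malsheen.

```python
import re

# Runs of ASCII letters, runs of ASCII digits, or runs of anything else.
_RUNS = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+")

def has_alpha_and_digit(token):
    """True when the token contains at least one letter and at least one digit."""
    return any(c.isalpha() for c in token) and any(c.isdigit() for c in token)

def split_alnum(token):
    """Split a mixed token at the boundaries between letter runs and digit runs."""
    if not has_alpha_and_digit(token):
        return [token]
    return _RUNS.findall(token)

print(split_alnum("Win95"))
```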

Allowable Subject Matter 

8. Claims 15, 16, and 21-23 are objected to as being dependent upon a rejected base claim, 
but would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

9. The following is a statement of reasons for the indication of allowable subject matter: 
With respect to Claims 15 and 21, the prior art of record fails to explicitly teach or make
obvious the combination of determining whether invalid tokens contain both alpha and numeric
characters, and if so segmenting the tokens into subtokens at boundaries between the alpha and 
numeric characters with the detection of punctuation marks and emoticons or punctuation strings 
in a method for validating word boundaries within a token. 

Claims 16 and 22-23 further limit claims containing allowable subject matter, and thus
also contain allowable subject matter.
Conclusion

10. Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

11. The prior art made of record and not relied upon is considered pertinent to applicant's
disclosure: 

Martinez-Guerra et al (U.S. Patent: 6,523,172)- teaches a means for validating tokens. 

12. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to James S. Wozniak whose telephone number is (571) 272-7632
and email is James.Wozniak@uspto.gov. The examiner can normally be reached on
Mondays-Fridays, 8:30-4:30.



Application/Control Number: 09/822,976 Page 11

Art Unit: 2655 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Wayne Young can be reached at (571) 272-7582. The fax/phone number for the 
Technology Center 2600 where this application is assigned is (703) 872-9306. 

Any inquiry of a general nature or relating to the status of this application or proceeding
should be directed to the technology center receptionist whose telephone number is
(703) 306-0377.



James S. Wozniak 
5/24/2005 




